Training YOLOv3 Models for Object Detection
The YOLOv3 model, which uses pre-trained weights for standard object detection problems, is accessible from the menu item Artificial Intelligence > Custom Deep Model Architecture > YOLOv3. This feature simplifies the training process and lets you apply trained models to selected datasets. Applying a YOLOv3 object detection model produces a series of 2D boxes around the objects of interest, as shown below.
From left to right: kidneys, yeast cells, Drosophila brain
The following publications provide additional information about YOLOv3.
- You Only Look Once: Unified, Real-Time Object Detection (https://arxiv.org/abs/1506.02640)
- YOLOv3: An Incremental Improvement (https://arxiv.org/pdf/1804.02767.pdf)
The following items are required for training YOLOv3 for object detection:
- Training dataset(s) for the input. See Cropping Datasets for information about extracting a training dataset as a subset of the original data.
- A target for the output, which must be a multi-ROI (see Labeling Multi-ROIs for Object Detection).
The following items are optional for training YOLOv3:
- One or more ROI masks for defining the working space of the model. For example, if you want to exclude some labeled areas from training, you need a mask.
Note Refer to the topic Creating Mask ROIs for information about creating mask ROIs.
Multi-ROIs that are used as the target output for object detection must meet the following conditions:
- The multi-ROI must have the same geometry as the input training data.
- All voxels within an input patch must be labeled for the patch to be considered during training, and each instance of an object must be labeled. Patches that are not fully segmented are ignored.
Note Applying a mask may limit the number of input patches that are processed.
- Label 1 must contain a fully labeled background, while the other labels must contain the object classes to detect. For example, the multi-ROI for a kidney detection model may contain the classes Background (label 1), Left Kidney (label 2), and Right Kidney (label 3).
Note A model with 'n' classes requires a multi-ROI with n+1 labels.
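The following Python sketch shows one way you might verify these conditions before training. It assumes the multi-ROI is available as a NumPy label array in which 0 means unlabeled, 1 is the background, and 2 through n+1 are the object classes; the check_multi_roi helper is hypothetical and not part of Dragonfly.

```python
# Hypothetical helper (not Dragonfly API): sanity-check a multi-ROI label
# array against the conditions above. Assumes 0 = unlabeled, 1 = background,
# and 2..n+1 = object classes.
import numpy as np

def check_multi_roi(labels: np.ndarray, class_count: int) -> None:
    # An 'n'-class model needs n+1 labels: background plus one per class.
    expected = set(range(1, class_count + 2))
    missing = expected - set(np.unique(labels).tolist())
    if missing:
        print(f"Missing labels: {sorted(missing)}")
    # Voxels left at 0 are unlabeled; patches containing them are ignored.
    unlabeled = int((labels == 0).sum())
    print(f"Unlabeled voxels: {unlabeled}")

# Example: a 2-class kidney model expects labels {1, 2, 3}.
demo = np.ones((64, 64), dtype=np.uint8)   # fully labeled background
demo[10:20, 10:20] = 2                     # 'Left Kidney'
demo[40:50, 40:50] = 3                     # 'Right Kidney'
check_multi_roi(demo, class_count=2)       # no missing labels, 0 unlabeled
```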
Object detection models require boxes as input. Dragonfly automatically converts each multi-ROI island into a box internally, so both of the segmentations shown below are valid training outputs. Note that in both cases the background is fully labeled.
Segmentation options for object detection
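The island-to-box conversion can be pictured with a short Python sketch. This is only a conceptual illustration using scipy.ndimage, not Dragonfly's internal implementation: each connected component of a class label is reduced to the smallest axis-aligned box that encloses it, which is why a rough blob and a hand-drawn rectangle around the same object are equally valid.

```python
# Conceptual sketch (not Dragonfly's internal code): reduce each labeled
# island of a class to the smallest 2D box that encloses it.
import numpy as np
from scipy import ndimage

def islands_to_boxes(labels: np.ndarray, class_label: int):
    """Return (row_min, col_min, row_max, col_max) for each island."""
    islands, _ = ndimage.label(labels == class_label)  # connected components
    return [(sl[0].start, sl[1].start, sl[0].stop, sl[1].stop)
            for sl in ndimage.find_objects(islands)]

demo = np.ones((100, 100), dtype=np.uint8)  # label 1 = fully labeled background
demo[30:60, 20:45] = 2                      # one island of class label 2
print(islands_to_boxes(demo, class_label=2))  # [(30, 20, 60, 45)]
```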
Patch size, class count, and input count can be selected in the Model Information dialog whenever you generate a new YOLOv3 model.
Click the New button on the Train tab to open the dialog, shown below.
Model Information dialog
| Option | Description |
|---|---|
| Patch size | During training, the training data is split into smaller 2D patches, whose size is defined by the 'Patch size' parameter. For example, if you choose a patch size of 256, the dataset is cut into sub-sections of 256 x 256 pixels, and these sub-sections are then used for training. Subdividing images makes each pass, or 'epoch', faster and less memory intensive. However, patches should be large enough to fully enclose the object(s) to be detected. Note Smaller patch sizes generally yield more truncated boxes resulting from boundary artifacts. Use the biggest possible patch size for the best results (see the sketch below this table). |
| Class count | The number of object classes to detect. For example, a kidney detection model usually has two classes, 'Left Kidney' and 'Right Kidney'. Note The entered Class count should not include the background labeled in the multi-ROI. A background class is added automatically. For example, if you choose '4' as the class count, then the classes that appear on the Train tab will be: Background, Class 1, Class 2, Class 3, and Class 4. |
| Input count | The number of channels in the input dataset: '1' for grayscale images and '3' for color images with red, green, and blue channels. |
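The patch-splitting behavior described above can be illustrated with the following minimal sketch. It assumes simple non-overlapping tiling of a 2D slice; the exact tiling Dragonfly uses (overlap, handling of edge tiles, and so on) may differ.

```python
# Minimal sketch of tiling a 2D slice into square training patches.
# Non-overlapping tiling is assumed; edge remainders are simply dropped here.
import numpy as np

def split_into_patches(image: np.ndarray, patch_size: int = 256):
    rows, cols = image.shape[:2]
    return [image[r:r + patch_size, c:c + patch_size]
            for r in range(0, rows - patch_size + 1, patch_size)
            for c in range(0, cols - patch_size + 1, patch_size)]

# A 1024 x 1024 slice yields 16 non-overlapping 256 x 256 patches.
slice_2d = np.zeros((1024, 1024), dtype=np.float32)
print(len(split_into_patches(slice_2d)))  # 16
```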
To generate a new YOLOv3 model, do the following:
- Choose Artificial Intelligence > Custom Deep Model Architecture > YOLOv3 on the menu bar.
The YOLOv3 dialog appears.
- Click the New button on the Train tab.
The Model Information dialog appears.
- Enter a name and description for the new model, as required.
- Choose the required Patch size in the drop-down menu, as shown below.

Recommendation Smaller patch sizes generally yield more truncated boxes resulting from boundary artifacts. You should use the biggest possible patch size for the best results.
- Choose the required Class count, as shown below.

Note The entered Class count should not include the background labeled in the output multi-ROI. A background class is added automatically. For example, if you choose '4' as the class count, then the classes that appear on the Train tab will be: Background, Class 1, Class 2, Class 3, and Class 4.
- Choose the Input count, as shown below.

Input count is the number of channels in the input dataset: '1' for grayscale and '3' for color images with red, green, and blue channels (see the sketch after this procedure).
- Click OK.
After processing is complete, the model appears in the Model list.
Note A background class is assigned automatically, and the labels in your output multi-ROI must match this layout: a fully labeled background plus one label per class.

- Continue to the topic Training Models for Object Detection to learn how to train your new model.
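If you are unsure which Input count to enter, a quick check of the dataset's array shape settles it. The helper below is illustrative only and assumes the data is available as a NumPy array; it is not part of Dragonfly.

```python
# Illustrative check (assumed helper, not part of Dragonfly): infer the
# input count from the shape of a 2D slice.
import numpy as np

def input_count(slice_2d: np.ndarray) -> int:
    # (rows, cols) -> 1 channel (grayscale); (rows, cols, 3) -> 3 channels (RGB)
    return 1 if slice_2d.ndim == 2 else slice_2d.shape[-1]

print(input_count(np.zeros((512, 512))))     # 1 (grayscale)
print(input_count(np.zeros((512, 512, 3))))  # 3 (RGB)
```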
You can start training a YOLOv3 model for object detection after you have prepared your training input(s) and output(s), as well as any required masks (see Prerequisites).
- Open the YOLOv3 dialog, if it is not already onscreen.
To open the dialog, choose Artificial Intelligence > Custom Deep Model Architecture > YOLOv3 on the menu bar.
- Do one of the following, as required:
- Generate a new model for object detection (see Generating New Models).
- Select an untrained or trained model from the Model list that contains the required number of classes and inputs.
Note In this case, you will need to click the Load button to load the selected model.
- Rename the classes and/or assign new colors to them, optional.

- Do the following in the Inputs box for each set of training data that you want to train the model with:
- Choose your training dataset in the Input drop-down menu.
Note If your model requires multiple inputs, for example when working with color images, select the additional input(s), as required.

- Choose the labeled multi-ROI in the Output drop-down menu.
Note Only multi-ROIs with the number of classes corresponding to the model's class count and with the same geometry as the input dataset will be available in the menu.
- Choose a mask in the Mask drop-down menu, optional.
Note If you are training with multiple training sets, click the Add New button and then choose the required input(s), output, and mask for the additional item(s).
The completed Inputs should look something like this:

- Adjust the Data augmentation settings, as required (see Data Augmentation Settings).
In most cases, the default values should give good results.
- Adjust the Training parameters, as required (see Training Parameters).
Note You should monitor the estimated memory ratio when you choose the training parameter settings. The ratio should not exceed 1.00 (see Estimated memory ratio).
- Click the Train button.
The dataset is validated and then automatically split into training and validation sets before training begins. You can monitor the progress of training in the Training Model dialog, as shown below.
During training, the quantities 'loss' and 'val_loss' should decrease. Continue to train until 'val_loss' stops decreasing (see the sketch after this procedure).
Note You can also click the List tab and then view the training metric values for each epoch.
- Wait for training to complete. You can also stop training at any time, if required.
- Evaluate the results for the training session, recommended (see Evaluating Training Results).
- Generate previews of the test set to evaluate the model, recommended (see Previewing Training Results).
- If the results are not satisfactory, you should consider doing one or more of the following and then retraining the model:
- Add an additional training set.
- Create a mask centered on problematic areas (see Creating Mask ROIs).
- Adjust the data augmentation settings (see Data Augmentation Settings).
- Adjust the training parameter settings (see Training Parameters).
- When the model is trained satisfactorily, click the Save button to save your object detection model.
- Apply the model to the original dataset or to similar datasets (see Applying Object Detection Models), as required.
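The stopping rule described in the training step above can be sketched as follows. This is assumed logic for illustration, not Dragonfly's implementation; the should_stop helper and its patience parameter are hypothetical.

```python
# Sketch of the stopping rule: keep training while 'val_loss' improves,
# and stop once it has not improved for a few epochs ('patience').
def should_stop(val_losses: list[float], patience: int = 3) -> bool:
    if len(val_losses) <= patience:
        return False
    best_earlier = min(val_losses[:-patience])
    # Stop if none of the last `patience` epochs beat the earlier best.
    return min(val_losses[-patience:]) >= best_earlier

history = [1.90, 1.20, 0.85, 0.80, 0.81, 0.82, 0.83]
print(should_stop(history))  # True: val_loss has stopped decreasing
```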
You can preview the results of training on an image slice of a selected dataset with the options in the Preview box, shown below.
Preview
- Show or hide a class by changing its visibility in the Classes box, optional.
- Select the dataset that you want the model to be applied to in the drop-down menu.
- Click the Apply button.
The preview appears in the selected view.
